-
Notifications
You must be signed in to change notification settings - Fork 94
Add get_model_lineage_dev CLI tool
#420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
a48b7a2 to
76ac932
Compare
|
Thanks Vince! I am wondering if we should not create new tools instead of making Here are my Pros/Cons of having new tools for
So, I am in the camp of adding this functionality but in new tools ; and I'd be keen to hear other people's opinions about it. As a side note, would you be able to set up signed commits for this repo? We can bypass this check at the PR level as repo admins, but this repo expects all commits to be signed now. |
|
@b-per Crap, that's how I originally had it (shouldn't have squashed 😢 ) I didn't like it because when I was trying it out, exactly like you pointed to, it seemed to arbitrarily pick which tool it used. So I could run very similar queries and it would give two different results. We could give better names/descriptions so the tool wouldn't get confused but I don't think the user would necessarily know if they were getting the answer they expected.
I think it would just run the tool and the user would see it asking to run If it were to be two separate tools, would you think it should be a cli tool or a discovery tool. My entire thought process was "Discovery is NOT just platform", especially after listening to Jason's talk at Coalesce where he talked about them abstractly. This very much relates to #418 in my head. Rebased with gpgSign on, no idea why it was set to false for this repo. Also on this
|
76ac932 to
c70b52e
Compare
|
Hey @VDFaller and @b-per, I'm sorry for the back-and-forth on this. I told Vince that I was appreciative of sticking with the existing I like the router approach because the agent typically doesn't care whether the information is coming from a local or remote source; the user cares more about that. Also, with the latest config changes, it is quite easy to point the agent to local or remote. If Furthermore, this router approach can be applied to the Semantic Layer tools in the future. It is a source of frustration for some users that these tools don't work locally. |
Add a fallback path for Discovery tools to get use CLI functionality Add ModelLineage type with main constructor `from_manifest` The CLI path will not work until auto-disable is functioning correctly.
c70b52e to
adf92cc
Compare
add endpoint args to the tool.
08b5157 to
c041c1a
Compare
get_model_lineage_dev CLI tool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a new get_model_lineage_dev CLI tool that enables non-cloud customers to retrieve model lineage information by parsing the local development manifest, supporting upstream, downstream, or bidirectional lineage queries with optional recursive traversal and test filtering.
Key Changes:
- Implemented
ModelLineagedata model with support for recursive parent/child traversal and cycle detection - Added
get_model_lineage_devtool to the dbt CLI toolset with configurable direction, recursion, and exclusion options - Updated existing discovery prompts to clarify distinction between production and development lineage tools
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/dbt_mcp/dbt_cli/models/lineage_types.py | New module implementing ModelLineage, Ancestor, and Descendant models with manifest parsing logic |
| src/dbt_mcp/dbt_cli/tools.py | Added get_model_lineage_dev function and _get_manifest helper to load and parse manifest.json |
| tests/unit/dbt_cli/test_model_lineage.py | Comprehensive test coverage for lineage parsing with various scenarios |
| src/dbt_mcp/tools/tool_names.py | Registered GET_MODEL_LINEAGE_DEV tool name |
| src/dbt_mcp/tools/toolsets.py | Added new tool to DBT_CLI toolset |
| src/dbt_mcp/tools/policy.py | Defined tool policy as METADATA behavior |
| src/dbt_mcp/prompts/dbt_cli/get_model_lineage_dev.md | Documentation for the new tool with usage examples |
| src/dbt_mcp/prompts/discovery/get_model_parents.md | Clarified this tool is for production manifest |
| src/dbt_mcp/prompts/discovery/get_model_children.md | Clarified this tool is for production manifest |
| README.md | Added get_model_lineage_dev to CLI tools list |
| .changes/unreleased/Enhancement or New Feature-20251203-203944.yaml | Changelog entry for the feature |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <[email protected]>
| def get_model_lineage_dev( | ||
| model_id: str, | ||
| direction: Literal["parents", "children", "both"] = "both", | ||
| exclude_prefixes: tuple[str, ...] = ("test.", "unit_test."), | ||
| *, | ||
| recursive: bool, | ||
| ) -> dict[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking: I think this should have essentially the same function signature as the other lineage tool: https://github.com/dbt-labs/dbt-mcp/pull/461/files#diff-6d91f0721d8dcde8199de504338811a7063757ec13f32eca508bfbc8b663a54bR390-R396
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make a separate PR once they're both in to align them. Cool?
| import json | ||
|
|
||
| _run_dbt_command(["parse"]) # Ensure manifest is generated | ||
| cwd_path = config.project_dir if os.path.isabs(config.project_dir) else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking: when the server starts up, we should make all paths absolute if they aren't already.
| ] | ||
| else: | ||
| # Build nested descendant trees. Prevent cycles using path tracking. | ||
| def _build_descendant(node_id: str, path: set[str]) -> Descendant: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we could consolidate the two versions of this function into one. The differences are minimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
had to do some funky casting to make mypy happy. let me know if you feel like it's better.
refactor ModelLineage.from_manifest for readability
Summary
Added a tool for parsing the manifest to get the lineage.
What Changed
Just parses then reads the manifest.
Why
So that it's usable by non-cloud customers.
Checklist
Additional Notes
There are some differences in the outputs. I can try to line them up if needed.
Prompt Call & Response
jqon the manifest